1,106 research outputs found

    Transfer Learning for Multi-language Twitter Election Classification

    Both politicians and citizens are increasingly embracing social media as a means to disseminate information and comment on various topics, particularly during significant political events, such as elections. Such commentary during elections is also of interest to social scientists and pollsters. To facilitate the study of social media during elections, there is a need to automatically identify posts that are topically related to those elections. However, current studies have focused on elections within English-speaking regions, and hence the resultant election content classifiers are only applicable to elections in countries where the predominant language is English. On the other hand, as social media is becoming more prevalent worldwide, there is an increasing need for election classifiers that can be generalised across different languages, without building a training dataset for each election. In this paper, based upon transfer learning, we study the development of effective and reusable election classifiers for use on social media across multiple languages. We combine transfer learning with different classifiers such as Support Vector Machines (SVM) and state-of-the-art Convolutional Neural Networks (CNN), which make use of word embedding representations for each social media post. We generalise the learned classifier models for cross-language classification by using a linear translation approach to map the word embedding vectors from one language into another. Experiments conducted over two election datasets in different languages show that without using any training data from the target language, linear translations outperform a classical transfer learning approach, namely Transfer Component Analysis (TCA), by 80% in recall and 25% in F1 measure.

    A Study of Snippet Length and Informativeness: Behaviour, Performance and User Experience

    The design and presentation of a Search Engine Results Page (SERP) has been subject to much research. With many contemporary aspects of the SERP now under scrutiny, work still remains in investigating more traditional SERP components, such as the result summary. Prior studies have examined a variety of different aspects of result summaries, but in this paper we investigate the influence of result summary length on search behaviour, performance and user experience. To this end, we designed and conducted a within-subjects experiment using the TREC AQUAINT news collection with 53 participants. Using Kullback-Leibler distance as a measure of information gain, we examined result summaries of different lengths and selected four conditions where the change in information gain was the greatest: (i) title only; (ii) title plus one snippet; (iii) title plus two snippets; and (iv) title plus four snippets. Findings show that participants broadly preferred longer result summaries, as they were perceived to be more informative. However, their performance in terms of correctly identifying relevant documents was similar across all four conditions. Furthermore, while the participants felt that longer summaries were more informative, empirical observations suggest otherwise; while participants were more likely to click on relevant items given longer summaries, they also were more likely to click on non-relevant items. These observations show, first, that longer is not necessarily better, though participants perceived that to be the case; and second, that there is a positive relationship between the length and informativeness of summaries and their attractiveness (i.e. clickthrough rates). These findings show that there are tensions between perception and performance when designing result summaries that need to be taken into account.
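The abstract above uses Kullback-Leibler distance as its measure of information gain but does not say which distributions were compared. As a minimal sketch only, the function below computes the standard KL divergence between two discrete distributions; the name `kl_divergence` and the choice of base-2 logarithms (bits) are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """Return KL(p || q) in bits for two discrete distributions.

    p and q are sequences of probabilities over the same outcomes.
    Which distributions the study compared is not stated in the abstract;
    this is only the textbook definition.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with p(x) = 0 contribute nothing
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log2(pi / qi)
    return total
```

The divergence is zero when the two distributions are identical and grows as they differ, which is what makes it usable as a gain measure between successive summary lengths.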

    No need to justify your choice: pre-compiling line breaks to improve eBook readability

    Implementations of eBooks have existed in one form or another for at least the past 20 years, but it is only in the past 5 years that dedicated eBook hardware has become a mass-market item. New screen technologies, such as e-paper, provide a reading experience similar to that of physical books, and even backlit LCD and OLED displays are beginning to have high enough pixel densities to render text crisply at small point sizes. Despite this, the major element of the physical book that has not yet made the transition to the eBook is high-quality typesetting. The great advantage of eBooks is that the presentation of the page can adapt, at rendering time, to the physical screen size and to the reading preferences of the user. Until now, simple first-fit linebreaking algorithms have had to be used in order to give acceptable rendering speed whilst conserving battery life. This paper describes a system for producing well-typeset, scalable document layouts for eBook readers, without the computational overhead normally associated with better-quality typesetting. We precompute many of the complex parts of the typesetting process, and perform the majority of the ‘heavy lifting’ at document compile-time, rather than at rendering time. Support is provided for floats (such as figures in an academic paper, or illustrations in a novel), for arbitrary screen sizes, and also for arbitrary point-size changes within the text.
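For readers unfamiliar with the baseline the abstract above refers to, here is a minimal sketch of a first-fit (greedy) line breaker. It is not the paper's precomputed system. The function name `first_fit_lines` is hypothetical, and widths are measured in characters rather than real glyph widths, which is a simplifying assumption.

```python
def first_fit_lines(words, max_width):
    """Greedy line breaking: put each word on the current line if it fits.

    Widths are counted in characters (an assumption; a real renderer
    would use glyph metrics). A word longer than max_width is placed
    on a line of its own rather than split.
    """
    lines = []
    current = ""
    for word in words:
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_width or not current:
            current = candidate
        else:
            lines.append(current)  # current line is full, start a new one
            current = word
    if current:
        lines.append(current)
    return lines
```

Because each decision looks only at the current line, the pass is linear in the number of words, which is why it is cheap at rendering time; the cost is that it never revisits an earlier break to even out spacing.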

    Theory and Application of Dissociative Electron Capture in Molecular Identification

    The coupling of an electron monochromator (EM) to a mass spectrometer (MS) has created a new analytical technique, EM-MS, for the investigation of electrophilic compounds. This method provides a powerful tool for molecular identification of compounds contained in complex matrices, such as environmental samples. EM-MS expands the application and selectivity of traditional MS through the inclusion of a new dimension in the space of molecular characteristics: the electron resonance energy spectrum. However, before this tool can realize its full potential, it will be necessary to create a library of resonance energy scans from standards of the molecules for which EM-MS offers a practical means of detection. Here, an approach supplementing direct measurement with chemical inference and quantum scattering theory is presented to demonstrate the feasibility of directly calculating resonance energy spectra. This approach makes use of the symmetry of the transition-matrix element of the captured electron to discriminate between the spectra of isomers. As a way of validating this approach, the resonance values for twenty-five nitrated aromatic compounds were measured along with their relative abundance. Subsequently, the spectra for the isomers of nitrotoluene were shown to be consistent with the symmetry-based model. The initial success of this treatment suggests that it might be possible to predict negative ion resonances and thus create a library of EM-MS standards. Comment: 18 pages, 7 figures

    Unbiased Comparative Evaluation of Ranking Functions

    Eliciting relevance judgments for ranking evaluation is labor-intensive and costly, motivating careful selection of which documents to judge. Unlike traditional approaches that make this selection deterministically, probabilistic sampling has shown intriguing promise since it enables the design of estimators that are provably unbiased even when reusing data with missing judgments. In this paper, we first unify and extend these sampling approaches by viewing the evaluation problem as a Monte Carlo estimation task that applies to a large number of common IR metrics. Drawing on the theoretical clarity that this view offers, we tackle three practical evaluation scenarios: comparing two systems, comparing k systems against a baseline, and ranking k systems. For each scenario, we derive an estimator and a variance-optimizing sampling distribution while retaining the strengths of sampling-based evaluation, including unbiasedness, reusability despite missing data, and ease of use in practice. In addition to the theoretical contribution, we empirically evaluate our methods against previously used sampling heuristics and find that they generally cut the number of required relevance judgments at least in half. Comment: Under review; 10 pages
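To make the Monte Carlo view in the abstract above concrete, the sketch below estimates a rank-weighted metric from sampled judgments using plain importance sampling. It is a generic illustration, not the estimators or variance-optimizing distributions derived in the paper. All names (`dcg_weights`, `true_metric`, `sampled_estimate`), the DCG-style weights, and the toy relevance labels are assumptions.

```python
import math
import random


def dcg_weights(n):
    """DCG-style rank discounts 1 / log2(rank + 1) for ranks 1..n (assumed metric)."""
    return [1.0 / math.log2(rank + 1) for rank in range(1, n + 1)]


def true_metric(weights, rels):
    """Metric value when every document has been judged."""
    return sum(w * r for w, r in zip(weights, rels))


def sampled_estimate(weights, judge, q, n_samples, rng):
    """Importance-sampling estimate of the metric.

    Documents are drawn with probabilities q (every q[d] must be > 0 where
    the weight is non-zero); each draw is reweighted by 1 / q[d], so the
    expectation of a single term equals the true metric.
    """
    docs = list(range(len(weights)))
    draws = rng.choices(docs, weights=q, k=n_samples)
    return sum(weights[d] * judge(d) / q[d] for d in draws) / n_samples


# Toy ranking: hypothetical relevance labels, sampling proportional to rank weight.
rels = [1, 0, 1, 1, 0]
weights = dcg_weights(len(rels))
q = [w / sum(weights) for w in weights]
estimate = sampled_estimate(weights, lambda d: rels[d], q, 20000, random.Random(0))
```

Unbiasedness follows from the reweighting: summing `q[d] * weights[d] * rels[d] / q[d]` over all documents cancels `q[d]` and leaves the full metric, whichever valid `q` is used. The choice of `q` only affects the variance, which is the quantity the paper's sampling distributions are designed to minimise.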

    The Energy of the Gamma Metric in the Møller Prescription

    We obtain the energy distribution of the gamma metric using the energy-momentum complex of Møller. The result is the same as obtained by Virbhadra in the Weinberg prescription.

    Regular and Chaotic Motion in General Relativity: The Case of a Massive Magnetic Dipole

    Circular motion of particles, dust grains and fluids in the vicinity of compact objects has been investigated as a model for accretion of a gaseous and dusty environment. Here we further discuss, within the framework of general relativity, figures of equilibrium of matter under the influence of combined gravitational and large-scale magnetic fields, assuming that the accreted material acquires a small electric charge due to the interplay of plasma processes and photoionization. In particular, we employ an exact solution describing the massive magnetic dipole and we identify the regions of stable motion. We also investigate situations when the particle dynamics exhibits the onset of chaos. In order to characterize the measure of chaoticness we employ techniques of Poincaré surfaces of section and of recurrence plots. Comment: 11 pages, 6 figures, published in the proceedings of the conference "Relativity and Gravitation: 100 Years after Einstein in Prague" (25. - 29. 6. 2012, Prague)

    On the Parity Problem in One-Dimensional Cellular Automata

    We consider the parity problem in one-dimensional, binary, circular cellular automata: if the initial configuration contains an odd number of 1s, the lattice should converge to all 1s; otherwise, it should converge to all 0s. It is easy to see that the problem is ill-defined for even-sized lattices (which, by definition, would never be able to converge to 1). We then consider only odd lattices. We are interested in determining the minimal neighbourhood that allows the problem to be solvable for any initial configuration. On the one hand, we show that radius 2 is not sufficient, proving that there exists no radius 2 rule that can possibly solve the parity problem from arbitrary initial configurations. On the other hand, we design a radius 4 rule that converges correctly for any initial configuration and we formally prove its correctness. Whether or not there exists a radius 3 rule that solves the parity problem remains an open problem. Comment: In Proceedings AUTOMATA&JAC 2012, arXiv:1208.249
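The task definition in the abstract above can be sketched as a small brute-force checker. This is an illustration of the problem statement only, not the radius 4 rule from the paper. The helper names (`step`, `target`, `solves`, `xor_rule`) and the `max_steps` bound are assumptions; `xor_rule` is used purely as a toy candidate.

```python
from itertools import product


def step(config, rule, radius):
    """Apply one synchronous update on a circular lattice."""
    n = len(config)
    return [
        rule([config[(i + d) % n] for d in range(-radius, radius + 1)])
        for i in range(n)
    ]


def target(config):
    """All 1s if the configuration has an odd number of 1s, else all 0s."""
    return [sum(config) % 2] * len(config)


def solves(rule, radius, n, max_steps=100):
    """Check every initial configuration of size n by exhaustive simulation.

    A rule passes if each configuration reaches its target within
    max_steps (an assumed bound) and the target is a fixed point.
    """
    for init in product([0, 1], repeat=n):
        config, goal = list(init), target(init)
        for _ in range(max_steps):
            if config == goal:
                break
            config = step(config, rule, radius)
        if config != goal or step(config, rule, radius) != goal:
            return False
    return True


def xor_rule(neighbourhood):
    """Toy candidate: new state is the XOR of the neighbourhood."""
    return sum(neighbourhood) % 2
```

On a lattice of size 3 the radius 1 neighbourhood covers every cell, so `xor_rule` computes the global parity in one step and `solves(xor_rule, 1, 3)` holds. On size 5 the same rule fails, which is consistent with the abstract's result that small radii are not sufficient in general.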